Parse almost-any sacct output #101

Open · wants to merge 7 commits into main
Conversation

@aowenson-imm commented Dec 19, 2023

Implements #100

This enables parsing almost any sacct output. Summary:

  • auto-detect the header in the input CSV and extract the field names from it (a sketch of the idea follows this list)
  • flexible SchedulerJobInfo creation
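
A minimal sketch of the header auto-detection idea; the helper name, delimiter, and detection heuristic below are assumptions, not the actual implementation:

```python
import csv

def detect_header_fields(csv_path, delimiter='|'):
    """Hypothetical helper: return the field names from the first line if it
    looks like a header row, otherwise None."""
    with open(csv_path, newline='') as fh:
        first_row = next(csv.reader(fh, delimiter=delimiter))
    # Heuristic: a header row contains sacct field names such as 'JobID' or
    # 'State' rather than job data (numeric job ids, timestamps, ...).
    if 'JobID' in first_row or 'State' in first_row:
        return first_row
    return None
```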

SchedulerJobInfo:

  • support more fields: user, state, timelimit, ru_ttime
  • modify to_dict() to return all fields, and move *_dt pruning into fields() (sketched below)
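
An illustrative sketch of the to_dict() / fields() split; the attribute set shown is made up:

```python
class SchedulerJobInfo:
    def __init__(self, user, state, submit_time, submit_time_dt):
        self.user = user
        self.state = state
        self.submit_time = submit_time
        self.submit_time_dt = submit_time_dt   # derived datetime value

    def to_dict(self):
        # Return every field, including the derived '*_dt' attributes.
        return dict(self.__dict__)

    def fields(self):
        # The '*_dt' pruning now lives here instead of in to_dict().
        return [name for name in self.__dict__ if not name.endswith('_dt')]
```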

SchedulerLogParser:

  • do not initialise the output CSV in __init__(); instead, initialise it when writing output (see the sketch after this list)
  • modify _get_job_field_names() to accept a job as an optional argument, from which it takes the field names
  • replace _write_csv_header() with _init_csv_output()
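
A rough sketch of the deferred initialisation; only _init_csv_output() and _get_job_field_names() are named by this PR, and the rest of the structure here is an assumption:

```python
import csv

class SchedulerLogParser:
    def __init__(self, output_csv=None):
        # No CSV work in __init__() any more; just remember the path.
        self._output_csv = output_csv
        self._csv_writer = None

    def _get_job_field_names(self, job=None):
        # When a job is supplied, take the field names from it.
        return job.fields() if job is not None else []

    def _init_csv_output(self, job=None):
        # Replaces _write_csv_header(): open the file and write the header
        # the first time output is actually produced.
        self._csv_fh = open(self._output_csv, 'w', newline='')
        self._csv_writer = csv.DictWriter(
            self._csv_fh, fieldnames=self._get_job_field_names(job))
        self._csv_writer.writeheader()

    def write_job(self, job):
        if self._output_csv and self._csv_writer is None:
            self._init_csv_output(job)
        row = {k: v for k, v in job.to_dict().items()
               if k in self._csv_writer.fieldnames}
        self._csv_writer.writerow(row)
```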

SlurmLogParser:

  • auto-detect whether the input CSV has a header; if so, replace SLURM_ACCT_FIELDS with the detected field names and skip the header line during parsing
  • rework _create_job_from_job_fields() to handle almost any sacct output and create a SchedulerJobInfo()
  • add parse_jobs_to_dict(), an alternative to file output, so the result can be fed directly into Pandas (usage sketched below)
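
A hypothetical usage sketch of parse_jobs_to_dict(); the import path, constructor arguments, and return shape are assumptions:

```python
import pandas as pd
from SlurmLogParser import SlurmLogParser   # assumed import path

# Assumed: the parser takes the path to the sacct output file.
parser = SlurmLogParser('sacct_output.txt')
jobs = parser.parse_jobs_to_dict()                 # assumed shape: {job_id: {field: value}}
df = pd.DataFrame.from_dict(jobs, orient='index')  # feed straight into Pandas
print(df[['user', 'state', 'timelimit']].head())
```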

Also fix handling of Slurm jobs with an epilog: for me, these were categorised as FAILED because only the epilog is COMPLETED, so retain the last State value rather than the first.
